Learning subsumption hierarchies of ontology concepts from texts
Authors
Abstract
This paper proposes a method for learning ontologies from a corpus of text documents. The method identifies concepts in the documents and organizes them into a subsumption hierarchy, without presupposing a seed ontology. It uncovers the latent topics assumed to generate the document text, and the discovered topics form the concepts of the new ontology. Concept discovery is language-neutral: it applies probabilistic space reduction techniques to the original term space of the corpus. The method then constructs a subsumption hierarchy over the concepts by performing conditional independence tests among pairs of latent topics, given a third one. The paper reports experimental results on the Genia and Lonely Planet corpora, from the domains of molecular biology and tourism respectively.
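The conditional independence test mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which statistic the authors use, so the code below swaps in an estimate of conditional mutual information as a stand-in. The function names, the `threshold` value, the binarized "topic present in document" encoding, and the toy data are all assumptions made for illustration. They are not taken from the paper.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(a, b, c):
    """Estimate I(A; B | C) in bits from three equal-length sequences
    of discrete values (here: 0/1 flags for topic presence per document)."""
    n = len(a)
    n_abc = Counter(zip(a, b, c))
    n_ac = Counter(zip(a, c))
    n_bc = Counter(zip(b, c))
    n_c = Counter(c)
    total = 0.0
    for (x, y, z), count in n_abc.items():
        # p(a,b,c) * p(c) / (p(a,c) * p(b,c)), written with raw counts
        ratio = (count * n_c[z]) / (n_ac[(x, z)] * n_bc[(y, z)])
        total += (count / n) * log2(ratio)
    return total


def conditionally_independent(a, b, c, threshold=0.01):
    """Hypothetical decision rule: treat A and B as independent given C
    when the estimated conditional mutual information is below threshold."""
    return conditional_mutual_information(a, b, c) < threshold


# Toy corpus (made up): each tuple is (topic_a, topic_b, topic_c) for one document.
# Topics a and b occur only in documents where topic c occurs, and are
# unrelated to each other within those documents.
docs = (
    [(0, 0, 0)] * 8
    + [(0, 0, 1)] * 2
    + [(0, 1, 1)] * 2
    + [(1, 0, 1)] * 2
    + [(1, 1, 1)] * 2
)
topic_a, topic_b, topic_c = zip(*docs)

# Conditioning on a constant recovers the plain (marginal) mutual information.
constant = [0] * len(docs)
marginal_mi = conditional_mutual_information(topic_a, topic_b, constant)
cmi_given_c = conditional_mutual_information(topic_a, topic_b, topic_c)

print("I(a; b)     =", marginal_mi)
print("I(a; b | c) =", cmi_given_c)
print("independent given c:", conditionally_independent(topic_a, topic_b, topic_c))
```

In this toy data the two topics look dependent on their own, but the dependence disappears once the third topic is taken into account. One plausible reading is that such a third topic is a candidate for a more general concept above the other two. The abstract does not spell out the exact rule used to build the hierarchy, so this interpretation is an assumption.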
Similar resources
Prime Numbers Considered Useful: Ontology Encoding for Efficient Subsumption Testing
Multiple inheritance hierarchies are frequently used for the classification of concepts into a taxonomy, to model software by organizing classes into an inheritance hierarchy, for querying object-oriented databases, for knowledge representation, policy enforcement, and subtyping of service interfaces for safe composition and substitution. All these areas apply hierarchies and share the same con...
Learning Relations for Terminological Ontologies from Text
The problem of learning concept hierarchies and terminological ontologies can be decomposed into two sub-tasks: concept extraction and relation learning. We describe a new approach to learning relations automatically from an unstructured text corpus, based on one of the probabilistic topic models, Latent Dirichlet Allocation. We first provide definition (Information Theory Principle for Concept Relat...
Improving Bio-Ontologies Matching Using Types and Adaptive Weights
Functional annotation consists in assigning a biological function to a given protein. It is a crucial task in biology and has various impacts on many fields, including understanding cellular processes and drug designing. In order to be able to share and reuse annotations, biologists and bioinformaticians have developed structured controlled vocabularies that were first simple classifications an...
Effective Ontology Learning: Concepts' Hierarchy Building using Plain Text Wikipedia
Ontologies lie at the heart of the Semantic Web. Nevertheless, engineering heavyweight or formal ontologies is commonly judged to be a tough exercise that requires time and heavy costs. Ontology Learning is thus a solution for this exigency and an approach to the ‘knowledge acquisition bottleneck’. Since texts are massively available everywhere, making up of experts’ knowledge and th...
Visualization of Subsumption Hierarchies in Ontologies
The RACER system is a knowledge representation system that implements description logic reasoning. It offers reasoning and evaluation services for multiple concepts (TBox) and multiple individuals (ABox) as well. The RACER system responds to taxonomy queries related to description logic. The body of the response contains information about a relational structure called a concept hierarchy or sub...
Journal:
- Web Intelligence and Agent Systems
Volume 8, Issue
Pages -
Publication date: 2010